Aleksei's personal blog

Improving dynamodb client performance with aws-sdk-go-v2

Let’s imagine you started using dynamodb and followed steps either from the official documentation or from this very nice third-party wrapper library guregu/dynamo.

If you are using dynamodb, there’s a good chance you want relatively high parallel throughput. However, in the application I built I ran into some weird issues in exactly this scenario. The symptoms were:

  • High CPU usage.
  • Very high dynamodb latencies when measured from the application (with a relatively low number of worker goroutines trying to make parallel reads).
  • Low usage and latency metrics when observed from the dynamodb table metrics.

Eventually, after taking a CPU profile the flame graph looked like this:

As the profile shows, the application spends most of its time establishing new HTTPS connections. But why does it do that?

Well, it turns out that the AWS SDK makes an HTTP call for every single dynamodb query (this is expected). What’s not expected is the missing connection reuse: usually, when you use http.Client, you automatically get some pooling logic and connection reuse, which means you don’t have to pay for a relatively expensive TCP open and TLS handshake every time you want to read some data.
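If you want to see this reuse for yourself, the standard library's net/http/httptrace package reports, per request, whether the connection came from the idle pool. Below is a minimal sketch that talks to a local httptest server rather than dynamodb; the helper names reusedFlags and demoReuse are made up for this post, and the short sleep is only there to give the transport time to return the connection to the pool.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// reusedFlags sends n sequential GET requests to url and records, for each
// one, whether the transport handed out an already-open connection.
func reusedFlags(client *http.Client, url string, n int) ([]bool, error) {
	flags := make([]bool, 0, n)
	for i := 0; i < n; i++ {
		var reused bool
		trace := &httptrace.ClientTrace{
			// GotConn fires once a connection has been obtained for the request.
			GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
		}
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		// Drain and close the body, otherwise the connection cannot be reused.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		flags = append(flags, reused)
		// Give the transport a moment to put the connection back into the pool.
		time.Sleep(10 * time.Millisecond)
	}
	return flags, nil
}

// demoReuse (hypothetical helper) runs three requests against a local server.
func demoReuse() ([]bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "ok")
	}))
	defer srv.Close()
	client := &http.Client{Transport: &http.Transport{}}
	return reusedFlags(client, srv.URL, 3)
}

func main() {
	flags, err := demoReuse()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("reused per request:", flags)
}
```

The first request should report a fresh connection and the following ones a reused one. Sequential requests like these are the easy case for the pool.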

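However, by default one http.Client will only keep a small number of “idle” connections per host: Go's net/http defaults to two (DefaultMaxIdleConnsPerHost), and the default client in aws-sdk-go-v2 also keeps only a small per-host pool. From the perspective of the http.Client, our dynamodb endpoint is one host.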

Because of that, after a dynamodb query is done, the client checks whether the connection can be kept for later use, but sees that the idle pool for this host has already hit that very low limit, so the connection is closed instead. This results in almost no connection reuse in this scenario.
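The effect is easy to reproduce without dynamodb. The sketch below is standard library only and runs against a local httptest server; it fires a few bursts of parallel requests and counts how many TCP connections the server had to accept. The function name countNewConns, the burst sizes and the sleeps are all arbitrary choices for illustration, not anything taken from the SDK.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// countNewConns runs `rounds` bursts of `workers` parallel GET requests
// against a local test server and returns how many TCP connections the
// server accepted in total.
func countNewConns(maxIdlePerHost, workers, rounds int) int64 {
	var opened int64

	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Keep all requests of a burst in flight at the same time.
		time.Sleep(20 * time.Millisecond)
		io.WriteString(w, "ok")
	}))
	// Count every freshly accepted connection on the server side.
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&opened, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	tr := &http.Transport{
		MaxIdleConns:        512,
		MaxIdleConnsPerHost: maxIdlePerHost,
	}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	for i := 0; i < rounds; i++ {
		var wg sync.WaitGroup
		for j := 0; j < workers; j++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				resp, err := client.Get(srv.URL)
				if err != nil {
					return
				}
				// Drain and close so the connection is eligible for reuse.
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
			}()
		}
		wg.Wait()
		// Let finished connections settle into the idle pool before the next burst.
		time.Sleep(10 * time.Millisecond)
	}
	return atomic.LoadInt64(&opened)
}

func main() {
	fmt.Println("new connections, 2 idle per host: ", countNewConns(2, 16, 5))
	fmt.Println("new connections, 64 idle per host:", countNewConns(64, 16, 5))
}
```

With only two idle connections allowed per host, most connections from one burst are thrown away and have to be re-established in the next one; with a larger pool the later bursts can mostly reuse what the first one opened. Against a real HTTPS endpoint every one of those extra connections also costs a TLS handshake.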

To fix this, raise both MaxIdleConns and MaxIdleConnsPerHost in your client:


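// imports
import (
    "context"
    "net/http"

    awshttp "github.com/aws/aws-sdk-go-v2/aws/transport/http"
    "github.com/aws/aws-sdk-go-v2/config"
)

// create a custom client that keeps many more idle connections around
httpClient := awshttp.NewBuildableClient().WithTransportOptions(func(tr *http.Transport) {
    tr.MaxIdleConns = 512        // total idle connections across all hosts
    tr.MaxIdleConnsPerHost = 512 // idle connections per host (dynamodb is one host)
})

// config for creating the client; pass the custom HTTP client defined above
cfg, err := config.LoadDefaultConfig(context.TODO(), config.WithHTTPClient(httpClient))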

After doing that, I noticed an instant and significant improvement in performance:

Unfortunately, it’s still not as good as it could be (for example, if you used ScyllaDB) because of tons and tons of JSON marshalling and unmarshalling (reflect performance is one of the most common Go performance problems).

Hope that helps you and I wish this was already a default when creating dynamodb clients.